DOCUMENT RESUME 



ED 407 425 



TM 026 449 



AUTHOR          Lockridge, Jewel
TITLE           Stepwise Analyses Should Never Be Used by Researchers.
PUB DATE        Jan 97
NOTE            10p.; Paper presented at the Annual Meeting of the
                Southwest Educational Research Association (Austin,
                TX, January 23-25, 1997).
PUB TYPE        Reports - Evaluative (142) -- Speeches/Meeting
                Papers (150)
EDRS PRICE      MF01/PC01 Plus Postage.
DESCRIPTORS     *Computer Oriented Programs; Error of Measurement;
                *Predictor Variables; Research Methodology; *Research
                Problems; *Sampling; *Statistical Analysis
IDENTIFIERS     Research Replication; Statistical Package for the
                Social Sciences; *Stepwise Regression



ABSTRACT 



Researchers persist in using stepwise regression in spite of 
problems with this approach. As noted by B. Thompson (1995), three problems 
accompany the use of stepwise applications. The first is that computer 
packages may use incorrect degrees of freedom in their computations, 
resulting in a greater likelihood of obtaining a spurious statistical 
significance. In the Statistical Package for the Social Sciences, although 
all the predictor variables explained in the analysis are examined for the 
initial step, the computer package only shows the degree of freedom 
corresponding to one predictor variable. Secondly, stepwise methods do not 
identify the best variable set of a given size correctly. Finally, stepwise 
methods tend to capitalize on sampling error and tend to produce results that 
are not replicable. This problem is caused by the uniqueness of sample data 
and the fact that sampling error in a given sample is not likely to occur in 
another sample. Researchers should consider and select other available 
methods for research. (Contains one table and five references.) (SLD) 
ABSTRACT 

Despite problems with using stepwise regression, researchers 
persist in using this analytical method. As Thompson (1995) 
noted, three problems accompany the use of stepwise applications: 

First, computer packages use incorrect degrees of freedom in 
their stepwise computations, resulting in artificially 
greater likelihood of obtaining spurious statistical 
significance. Second, stepwise methods do not correctly 
identify the best variable set of a given size. Third, 
stepwise methods tend to capitalize on sampling error and 
thus tend to yield results that are not replicable. (p. 525) 

Stepwise Analyses Should Not Be Used By Researchers 



Analytical procedures involving stepwise regression are used 
frequently; in fact, stepwise methods are among the most 
commonly used investigative procedures (Snyder, 1991). The 
popularity enjoyed by stepwise regression among researchers may be 
due, at least in part, to the relatively uncomplicated nature of 
the procedures. The ease with which stepwise analyses can be 
conducted belies the compound and complex problems which arise 
from having conducted such studies (Beasley & Leitner, 1994). 
Beasley and Leitner (1994) registered criticism of stepwise 
regression procedures for their statistical distortions and 
misinterpretation of results. 

Despite sharp criticism (Beasley & Leitner, 1994; Snyder, 
1991; Thompson, 1995) of the use of stepwise regression analyses, 
there is no shortage of researchers who continue to rely on the 
results of this method. Perhaps researchers are not aware of 
three serious problems with the use of stepwise regression. 

As Thompson (1995) noted, three specific problems accompany 
the use of stepwise applications: 

First, computer packages use incorrect degrees of 
freedom in their stepwise computations, resulting 
in artificially greater likelihood of obtaining 
spurious statistical significance. Second, 
stepwise methods do not correctly identify the best 
variable set of a given size. Third, stepwise 
methods tend to capitalize on sampling error and 
thus tend to yield results that are not replicable. 

(p. 525) 

Problems with Stepwise Regression Analyses 

Problem one manifests itself in the fact that computer 
packages use incorrect degrees of freedom. A predictor variable, 
once examined, is like the sword a matador thrusts into the bull. 
The researcher who in the first step of the stepwise analysis 
records one degree of freedom when several or all predictor 
variables were actually examined is like that matador who thrusts 
the sword into the bull, decides that another area of the now 
wounded animal would be a more vulnerable target, quickly extracts 
the sword, and strikes again. As if the series of thrusts is not 
bad enough, the matador adds insult to injury by pretending that 
the first thrust never occurred. The animal's wounds are the 
result of two (or more) sword strikes, not just one as the matador 
pretends. In a given step of stepwise, the matador or researcher 
reaps the benefits of all the thrusts or degrees of freedom while 
being charged with the use of only one. 

Computer programs likewise fail to display the correct number 
of degrees of freedom for stepwise analyses. In SPSS, despite the 
fact that all predictor variables explained in the analysis were 
examined for step one, the computer package incorrectly shows only 
one degree of freedom, recording only the single predictor variable 
entered instead of the number of predictors actually 
examined (Snyder, 1991). 
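
The inflation described above can be checked with a small 
simulation. The following is a minimal sketch in Python, not taken 
from the sources cited here. It assumes 30 cases, predictors that 
are independent of one another and unrelated to the dependent 
variable, and the tabled .05 critical value of F for 1 and 28 
degrees of freedom (about 4.196). The function names and all 
settings are illustrative assumptions. 

```python
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def false_positive_rate(n=30, k=10, trials=500, f_crit=4.196, seed=0):
    """Share of null data sets in which the best of k predictors is
    declared significant when tested as if only one predictor had
    been examined (1 and n - 2 degrees of freedom).

    f_crit is the assumed .05 critical F for 1 and 28 df, so it only
    matches the default n = 30.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # The dependent variable is pure noise: no predictor is related to it.
        y = [rng.gauss(0, 1) for _ in range(n)]
        # Step one of stepwise: examine all k predictors, keep the best.
        best_r2 = max(
            pearson_r([rng.gauss(0, 1) for _ in range(n)], y) ** 2
            for _ in range(k)
        )
        # F statistic charged with a single numerator degree of freedom.
        f_stat = best_r2 * (n - 2) / (1 - best_r2)
        if f_stat > f_crit:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print("one predictor examined: ", false_positive_rate(k=1))
    print("ten predictors examined:", false_positive_rate(k=10))
```

Under these assumptions the one-predictor rate should sit near the 
nominal .05, while the best-of-ten rate should approach 
1 - .95^10, or about .40, which is the sense in which the single 
degree of freedom understates what was actually examined. 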

The second problem with stepwise methods is that they do not 
correctly identify the best variable set of a given size. 
Stepwise regression was one of a group of analyses used to compare 
data on Tennessee's school district report cards. The focus of 
each initial study was to determine the impact of predictor 
variables on the dependent variable of student outcome. Bobbett 
and French (1993) compared the percentage of variance for the 
original studies using Pearson Product Moment (PPM), Guttman's 
Partial Correlation (GPC), Stepwise Regression (Forward) (SR), and 
the probability of the Multiple Regression (MR). The researchers 
examined three of the eight variables from the original Tennessee 
studies. One purpose of the study was to determine how the use of 
different analyses impacted conclusions. 

These findings illustrated, among other things, that stepwise 
does not necessarily pick the predictor set of a given size 
yielding the highest R². A snapshot of the impact of three 
variables, A, B, and C, on the dependent variable, student outcome, 
at the elementary, middle, high school, and system levels is found 
in Table 1. 

Table 1 

Comparison of Outcomes for Three Analyses 

Variable A 

        Elementary School   Middle School   High School   System 
PPM     26%                 24%             28%           33% 
GPC     7%                  2%              0%            5% 
SR      25%                 0%              0%            32% 

Variable B 

        Elementary School   Middle School   High School   System 
PPM     19%                 28%             19%           27% 
GPC     1%                  0%              3%            0% 
SR      >1%                 >1%             >1%           >1% 

Variable C 

        Elementary School   Middle School   High School   System 
PPM     21%                 26%             30%           31% 
GPC     2%                  6%              5%            7% 
SR      none                minor           large         minor 


In the original Tennessee studies, predictors four through 
eight sometimes yielded higher R²s than the three variables in the 
Bobbett and French study. Since only three predictors were 
considered in this study, the researchers might have 
overemphasized their contribution to the impact on student outcomes. 
It is also apparent from the table that, depending on the method of 
comparison, certain predictor variables will show a higher 
correlation to the dependent variable. 
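
The way a forward stepwise search can miss the best set of a given 
size can be shown with simple arithmetic on correlations. The 
sketch below is not based on the Tennessee data. The three 
predictors x1, x2, and x3 and every correlation value are 
hypothetical, chosen only so that the single strongest predictor 
does not belong to the best pair. It uses the standard 
two-predictor formula 
R² = (r1² + r2² - 2 r1 r2 r12) / (1 - r12²). 

```python
from itertools import combinations

# Hypothetical correlations, invented for illustration only.
# Correlation of each predictor with the dependent variable:
r_y = {"x1": 0.6, "x2": 0.5, "x3": 0.5}
# Correlations among the predictors:
r_xx = {("x1", "x2"): 0.6, ("x1", "x3"): 0.6, ("x2", "x3"): 0.0}


def r2_pair(a, b):
    """Squared multiple correlation for a two-predictor model."""
    rab = r_xx[tuple(sorted((a, b)))]
    return (r_y[a] ** 2 + r_y[b] ** 2 - 2 * r_y[a] * r_y[b] * rab) / (1 - rab ** 2)


def forward_two():
    """Forward stepwise: enter the best single predictor, then the
    predictor that adds the most given the first one."""
    first = max(r_y, key=lambda v: r_y[v] ** 2)
    second = max((v for v in r_y if v != first), key=lambda v: r2_pair(first, v))
    return (first, second), r2_pair(first, second)


def best_two():
    """All-possible-subsets search over every pair of predictors."""
    pair = max(combinations(sorted(r_y), 2), key=lambda p: r2_pair(*p))
    return pair, r2_pair(*pair)


if __name__ == "__main__":
    print("forward stepwise:", forward_two())
    print("best pair:       ", best_two())
```

With these made-up values, the forward search enters x1 first and 
ends with an R² of about .39, while the pair x2 and x3, which the 
forward search never evaluates, reaches an R² of .50. 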

The third problem with stepwise methods is that they tend to 
capitalize on sampling error and thus tend to yield results that 
are not replicable. The uniqueness of sample data is the cause of 
this third problem with using stepwise procedures. Sampling error 
in a given sample is not likely to occur in another sample. 
Thompson (1995) reasoned that sampling error makes stepwise 
applications a bad idea: "Sampling error is variability in sample 
data unique to that given sample and therefore cannot be 
reproduced in subsequent samples" (p. 532). Because of the 
uniqueness of sampling error, results are not replicable from one 
sample to the next, making valid generalizations to the 
population unlikely. For more valid generalizations to the 
population, Huberty (1989) presented this strategy for the 
researcher who insists on using stepwise methods: 

Inferences about "best" subsets and variable 
importance to other units should be made with great 
caution. The "best" variable subset for one sample 
of units may be far from the best for other 
samples. The greater the ratio of sample size to 
number of response variables, the more reasonable 
are the implied generalizations. A large such 
ratio alone, however does not insure valid 
generalizations. Valid generalizations may be 
obtained only to the extent that the pattern of 
response variable intercorrelations for non-design 
sample experimental units follow the pattern 
present in the design sample. (p. 63) 

As illustrated by Snyder (1991) in a set of elaborate tables, 
sampling error tends to yield results that are not replicable. 
Huberty (1989) suggested various resampling strategies such as 
bootstrapping or jackknifing as ways to get around sampling error 
problems. 
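
One way such resampling might be applied is to ask how often the 
same predictor would be entered first across bootstrap samples. 
The sketch below is an illustration only and is not Huberty's 
procedure. The data are simulated, with four hypothetical 
predictors that are all weakly and equally related to the 
dependent variable, and the function names are assumptions. 

```python
import random
from collections import Counter


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def first_pick(columns, y):
    """Index of the predictor a forward stepwise run would enter first."""
    return max(range(len(columns)), key=lambda j: pearson_r(columns[j], y) ** 2)


def bootstrap_first_picks(columns, y, resamples=200, seed=0):
    """Count which predictor is entered first in each bootstrap sample."""
    rng = random.Random(seed)
    n = len(y)
    counts = Counter()
    for _ in range(resamples):
        # Draw n cases with replacement from the original sample.
        idx = [rng.randrange(n) for _ in range(n)]
        cols_b = [[col[i] for i in idx] for col in columns]
        y_b = [y[i] for i in idx]
        counts[first_pick(cols_b, y_b)] += 1
    return counts


# Simulated (hypothetical) sample: 40 cases, 4 predictors.
_rng = random.Random(1)
N_CASES, N_PREDICTORS = 40, 4
columns = [[_rng.gauss(0, 1) for _ in range(N_CASES)] for _ in range(N_PREDICTORS)]
y = [0.3 * sum(col[i] for col in columns) + _rng.gauss(0, 1) for i in range(N_CASES)]

counts = bootstrap_first_picks(columns, y)

if __name__ == "__main__":
    print("first pick in the full sample:", first_pick(columns, y))
    print("first picks across resamples: ", dict(counts))
```

If the first-entered predictor changes from one resample to the 
next, the order of entry in the original sample is unlikely to 
replicate, which is the concern raised about sampling error above. 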

Summary 

Since stepwise methods not only fail to accomplish the goals 
set forth by researchers but also compound inaccurate findings and 
further invalidate results, other forms of analysis should be 
explored. The "simple" problems are, first, that computer packages 
use incorrect degrees of freedom in their stepwise computations; 
second, that stepwise procedures do not correctly 
identify the best variable set of a given size; and finally, that 
stepwise methods tend to capitalize on sampling error and thus 
tend to yield results that are not replicable. These "simple" 
problems are just the beginning of what usually leads to more 
complex statistical aberrations. To prevent these "molehills" 
from becoming mountains, the resourceful researcher should 
consider and select other available methods for research. 